PREFACE

This report describes part of a comprehensive and continuing program
of research concerned with advancing the state of the art in remote sensing
of the environment from aircraft and satellites. The research is being
carried out for the NASA Lyndon B. Johnson Space Center, Houston, Texas,
by the Environmental Research Institute of Michigan (ERIM), formerly the
Willow Run Laboratories of The University of Michigan. The basic objective
of this multidisciplinary program is to develop remote sensing as a practical
tool to provide the planner and decision-maker with extensive information
quickly and economically.

Timely information obtained by remote sensing can be important to such 
people as the farmer, the city planner, the conservationist, and others 
concerned with problems such as crop yield and disease, urban land studies 
and development, water pollution, and forest management. The scope of our
program includes: (1) extending the understanding of basic processes;
(2) discovering new applications, developing advanced remote-sensing
systems, and improving automatic data processing to extract information in
a useful form; and (3) assisting in data collection, processing,
analysis, and ground-truth verification.

The research described here was performed under NASA Contract NAS9-14123,
Task VII, and covers the period from 15 May 1974 through 14 March 1975.
Dr. Andrew Potter has been Technical Monitor. The program was directed by
R.R. Legault, Vice-President of ERIM, by J.D. Erickson, Project Director
and Head of the Information Systems and Analysis Department, and by
R.F. Nalepka, Principal Investigator and Head of the Multispectral Analysis
Section. The ERIM number for this report is 109600-14-F.

The authors wish to acknowledge the direction provided by Mr. R.R. Legault,
Dr. J.D. Erickson, and Mr. R.F. Nalepka. Many constructive discussions were
held with H. Horwitz, R.J. Kauth, W. Richardson, and many others at ERIM.

1

SUMMARY 

This report covers the continuation of a study of the use of a
Kalman filter for adaptive processing. Included are analytical and test
results pertaining to LANDSAT data. An earlier report [1] describes the
first portion of the study, including test results using aircraft MSS data.

The purpose of adaptive processing is to update the mean vectors of the
class signatures continuously, using the data themselves to provide the
updating, thereby allowing a local signature to have more universal
applicability for classification. Originally, the Kalman filter was chosen
because it provided an ordered structure into which many attractive ad hoc
updating techniques could be fitted and better understood, with others
perhaps being derived. A limitation usually found in Kalman filters, the
requirement for excessive processing time and computer memory when a large
number of states must be updated, was circumvented by the use of a simplified
form of the filter. (Section 3.2 contains an equivalence relationship
between the normal and simplified forms.)

In one form of the Kalman filter, the state variations are described
as a Markov process. Occasionally, this description is replaced with
an assumption of correlated variations. Because states are identified with
the mean vectors of the class signatures, the correlated variation
assumption was used, with a correlation length corresponding approximately
to agricultural field sizes.

In our earlier report, we presented test results on the Kalman filter
form of adaptive processing that were obtained by using aircraft MSS data.
In that report, we showed that the modifications to the basic Kalman filter
appropriate to decision-directed classification would perform the various
optional functions for which they were designed, such as line-by-line updating
rather than pixel-by-pixel. In this report, it is shown that the same
conclusion can be reached from the LANDSAT data test results.


[1] Crane, R.B., Adaptive Processing with a Decision-Directed Kalman
Filter and Feature Extraction of Multispectral Data, Report No.
190100-31-T, Environmental Research Institute of Michigan,
Ann Arbor, July 1974.

In one test it was found that the signature means did not have to
be the means of the data classes. With adaptive processing the means
can vary approximately 20% without degrading classification accuracy
below that obtained with the measured means and non-adaptive processing.
This result indicates that adaptive processing may be used to accomplish
limited signature extension (i.e., to permit a signature established under
one set of measurement conditions to be successfully applied under slightly
different conditions). For situations where mean values differ by more than
20% and other signature extension techniques could transform the means
so that their values were within the 20% range, adaptive processing could
be used to find even better values. The same test showed that if the
means were outside of a 35% range of values, then non-adaptive processing
outperformed adaptive processing. This result is not particularly
significant because unusably poor classification accuracy was obtained
in either case. In fact, several test results are available that show that
adaptive processing cannot improve classification accuracy when the
non-adaptive classification accuracy is poor. An explanation may be that the
Kalman filter is decision directed, requiring that most of the data points
be classified accurately so that the points can be directed to update the
correct means.

2

BACKGROUND AND INTRODUCTION 

Multispectral scanner data acquired by remote sensing will usually
cover large areas on the ground. The large amounts of data collected make
machine processing extremely desirable for extracting information. To
use machine processing, one must rely on some commonality within the data,
usually spectral invariance among ground covers of the same class. The
quality of the extracted information is limited by the extent to which
such commonality exists, as well as by limited knowledge of the commonality.

One method of processing multispectral scanner data to extract
information is to select data subsets, called training sets, for each of the
classes, and use these to derive a signature for that class. The signature
consists of a mean vector and a covariance matrix describing the data vectors
from the training sets. Then, assuming that the signatures for each class
accurately represent that class, a decision rule is defined which assigns
each of the scanner data points to one of the classes (or to a null class).

There are many reasons why a class signature may not accurately represent
that particular class at all times and places: 1) there may be an insufficient
number of data points in the training set; 2) the class may be composed of
subclasses with different reflectances and the signature could describe
only one subclass; 3) atmospheric conditions may not be constant between
the training and test areas; and 4) the illumination and viewing geometry
(position of the sun and the sensor relative to the ground resolution
element) could differ for training and non-training data. Signatures
should be the best obtainable so that classification accuracy is maximized,
thereby maximizing the information extracted from the data.

One way to increase the accuracy of the signatures is to increase the 
number of training sets. If the training sets are located throughout the 
scene, one can use different signatures for different portions of the scene. 
However, the number of training sets that can be used is limited; a point
is quickly reached beyond which the cost of obtaining the requisite
ground truth becomes prohibitive for operational systems.

A second way to increase signature accuracy is to preprocess the
signatures or data so that variations are accounted for, removed, or
reduced. This method can be extremely effective for certain types of
variations, and can greatly reduce the number of training sets required
and hence the cost. However, this method becomes less effective when
the variations are different for different classes.

A third way to increase signature accuracy and thereby classification 
accuracy is to use adaptive processing to update the signatures as the data 
are being processed. The basic idea is simple. Suppose that a number of 
points are identified as belonging to a certain class, and that the average 
value of these points is larger than the mean vector of the signature for 
that class. Then the signature is changed by increasing the mean vector. 
The amount of increase depends on the number of points, the amount by which 
the average exceeds the mean vector, and a factor which will be called the 
updating rate. This basic adaptive processing method, along with some 
imaginative variations, was used for processing multispectral scanner 
data [3]. The success of the initial effort led to the use of a Kalman 
filter. 
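
As a rough illustration of the basic update just described, the following Python sketch moves a class mean toward the average of the points assigned to that class. The function name `update_mean`, the `rate` parameter, and the particular blending formula are assumptions made for illustration only; they are not the formulation used in [3].

```python
def update_mean(mean, points, rate):
    """Move a class mean toward the average of newly classified points.

    mean   : list of floats, current signature mean (one value per channel)
    points : list of data vectors assigned to this class
    rate   : hypothetical updating rate (non-negative)
    """
    n = len(points)
    if n == 0:
        return list(mean)  # nothing was assigned to this class
    channels = len(mean)
    average = [sum(p[c] for p in points) / n for c in range(channels)]
    # The weight grows with the number of points and the updating rate,
    # and stays below 1 so the old mean is never discarded entirely.
    weight = rate * n / (1.0 + rate * n)
    return [m + weight * (a - m) for m, a in zip(mean, average)]


new_mean = update_mean([10.0, 20.0], [[12.0, 22.0], [14.0, 24.0]], rate=0.5)
```

The size of the change depends on the three quantities named in the text: the number of points, the difference between their average and the current mean, and the updating rate.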

A Kalman filter is used to update the mean vectors of the signatures
while the data are being classified. We combine the mean vectors for all
of the classes into one large state vector, which is then updated. The
Kalman filter method has the advantages of being an iterative technique
ideally suited for use with a digital computer, and of providing a general
formulation or organization that combines many techniques for improving
recognition. There are, however, three potential disadvantages: 1) the
Kalman filter normally requires large amounts of computer memory; 2) it
requires large amounts of computation time; and 3) there is a possibility
of "capture".

The reason for the large memory requirement is that the joint
statistics of the state variables (the covariance of the state vectors) must
be stored, as must the covariance of the individual observations of these
state variables (i.e., the crop signature covariances). These can be
thought of respectively as the statistics of the underlying process (state
vector) and the statistics of the noise of observation of that process.

[3] Marshall, R.E., F.J. Kriegler, and W. Richardson, Adaptive Multispectral
Recognition of Wheat, Tenth Symposium on Adaptive Processes, Miami
Beach, December 1971.

The reason that the Kalman filter normally requires a large amount of
computation time is that both of the above forms of covariance enter the
calculation. The calculation time for each update varies as the product
of state vector calculation time and noise calculation time. If the
basic signal is n channels and there are m materials, then the state vector
is mn in length, the state covariance is mn x mn, and the noise covariance
is n x n for each signature.
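
To make the storage growth concrete, the short Python sketch below counts the stored elements for the unsimplified filter. The function name and the example of 4 channels and 10 materials are illustrative assumptions, not values taken from this study.

```python
def kalman_storage(n_channels, m_materials):
    """Count the stored elements for the unsimplified filter."""
    state_len = m_materials * n_channels        # state vector length mn
    state_cov = state_len * state_len           # mn x mn state covariance
    noise_cov = m_materials * n_channels ** 2   # one n x n matrix per signature
    return state_len, state_cov, noise_cov


# Hypothetical example: 4 channels and 10 materials.
state_len, state_cov, noise_cov = kalman_storage(4, 10)
```

The state covariance grows as the square of mn, which is why it dominates the memory requirement as the number of materials increases.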

The statistics of the state vector were not known initially; these
could only be learned by observation of the Kalman filter process itself.
Hence an initial assumption had to be made: that the state covariance was
a higher dimensional image of the average noise covariance. This assumption
turned out to be fortuitous, since it simplified the Kalman filter equations
greatly, and both calculation times and memory requirements became
negligible.

The third possible disadvantage mentioned above was the possibility
of "capture", wherein the mean signature for material A becomes adapted
to the true mean of material B. In this case material B has captured the
signature for material A and, as a result, material B systematically
becomes classified as material A. Capture can occur when false classifications
cause the mean vectors of one or more classes to be updated using the wrong
data points.

Use of the Kalman filter necessitated three assumptions, the
justification for which is that the resulting updating equations increase
classification accuracy. One assumption is that the recognition decisions
are correct. Because this assumption is questionable, we borrowed a
technique from our initial study [4] to modify the amount of updating by a
confidence factor which reflects the uncertainty in the correctness of
the decisions. Another assumption is that all of the covariance matrices of
the signatures are equal. This assumption is made to derive the Kalman
filter, and is not used when the decision rule is computed. A third
assumption is that the covariance of any two mean vectors is proportional
to the same common covariance matrix. We do not anticipate sufficient
ground truth in most data sets to permit the use of more accurate
statistics. We also assume that the data are Gaussian or, equivalently,
that we want the optimum (in the mean square sense) linear updating
equations.

[4] Kriegler, F.J., R.E. Marshall, H.H. Horwitz, and M.F. Gordon, Adaptive
Multispectral Recognition of Agricultural Crops, Eighth International
Symposium on Remote Sensing of Environment, ERIM, Ann Arbor, October 1972.

We have built several functions into our Kalman filter. In its
simplest form, the filter can be used to update, after each decision, the
mean vector of the class that was recognized. One of the functions is the
ability to update all mean vectors, when we wish to include interaction,
i.e., non-zero covariance between pairs of mean vectors. Additionally, we
can update after every line or fraction of a line has been classified, a
feature which reduces the time required for processing a data set compared
to updating after every point. We also have incorporated the confidence
factor mentioned previously.

Thus far the discussion has been limited to updating the signature
mean vectors. We also have the capability of estimating and updating angular
dependence of the data. For large data sets it may prove desirable to update
the angular dependence, because some of the reasons for needing an updating
algorithm, e.g., atmospheric changes, are also reasons why the angular
dependence of the mean vectors would change.

Another feature of this program is the capability to use auxiliary
training fields to improve the updating accuracy and lessen the probability
of capture. When this feature is used, the updated value of the mean vectors
depends not only on the original signatures and the data already processed,
but also on signatures of training sets to be processed downstream. The
use of auxiliary training fields should reduce the ground truth requirement,
because signatures of the auxiliary training fields are used for data
collected both before and after the training sets, with continuity between
training sets provided by the Kalman filter.

The original adaptive processing [3] and our first implementation of
the Kalman filter were used with the quadratic (maximum likelihood)
decision rule. One step in improving the adaptive processing procedure
was to change to the linear decision rule [5]. Previous comparisons of the
two decision rules showed that when the same number of data channels were
used, the classification accuracies obtained with the two rules were
approximately the same for test data, and the linear rule required
approximately one-third the processing time of the quadratic rule. The
fact that the linear rule provided slightly higher classification accuracy
was felt to be statistically insignificant. A recent comparison made as
part of the CITARS program [6] confirms the approximate classification
accuracy equivalence of the two decision rules.

A test program was initiated to show the usefulness of the Kalman 
filter approach to adaptive processing. Initial results, using aircraft 
MSS data, indicated that the approach was useful for processing data 
gathered under a variety of conditions [1]. 

This report summarizes our modifications and testing of the Kalman
filter for LANDSAT data. Simplifications of the theory presented in our
earlier report are contained in Section 3. Also included are extensions to
the theory that are applicable to LANDSAT data. Results of testing the
modified Kalman filter are presented in Section 4. Conclusions and
recommendations are presented in Section 5, which includes a recommended test
plan applicable to the LACIE program.

[5] Crane, R.B., and W. Richardson, Performance Evaluation of Multispectral
Scanner Classification Methods, Eighth International Symposium on
Remote Sensing of Environment, Willow Run Laboratories of the
Institute of Science and Technology, The University of Michigan,
Ann Arbor, October 1972.

[6] Malila, William A., Daniel P. Rice, and Richard C. Cicone, Final
Report on the CITARS Effort by the Environmental Research
Institute of Michigan, ERIM Report No. 109600-12-F, Environmental
Research Institute of Michigan, Ann Arbor, Michigan, February 1975.

3

DESCRIPTION OF MODIFIED KALMAN FILTER 


In previous work [1] a general formulation of the Kalman filter was
presented and the way such a filter could be used to update the signature 
mean vectors was discussed in qualitative terms. Various terms of the 
equations were identified with observed phenomena. The dependence of the 
equations upon the statistical properties of the data was shown explicitly. 
One possible approximation to the statistics was shown to lead to a 
simplification of the equations along with greatly reduced computational 
and memory requirements of a general purpose digital computer. Also 
shown were some extensions of the filter that appeared at the time to 
be most useful for either aircraft or satellite data. 

In this section we repeat the analytical description of the Kalman 
filter, although the approach is different. We then show a general 
equivalence between the Kalman filter and a simplified filter. Finally, 
the equivalence relationship is used to show various modifications to the 
Kalman filter that have proven useful. 

3.1 BASIC KALMAN FILTER 

Beginning with one version of the Kalman filter equations, the first
problem is to estimate $X_k$ using measurements $Z_0, \ldots, Z_k$. We
identify $X_k$ with a vector composed of all of the class mean vectors, and
identify the $Z_i$ with data vectors. An optimum estimate of $X_k$, which we
shall label $\hat{X}_k$, must be found which minimizes:

$$M = E\{[X_k - g(Z)]^t [X_k - g(Z)]\}. \qquad (1)$$

Here, $g(Z)$ represents a function of $Z_0, \ldots, Z_k$. The probability
distribution of $X_k$ and $Z$ can be written as

$$f(X_k, Z) = f(X_k \mid Z)\, f(Z) \qquad (2)$$

so Equation 1 becomes

$$M = \int f(Z) \left\{ \int [X_k - g(Z)]^t [X_k - g(Z)]\, f(X_k \mid Z)\, dX_k \right\} dZ. \qquad (3)$$

Now $M$ is nonnegative, so we minimize the bracketed integral for every $Z$.
In this integral, $g(Z)$ is a constant, and the integral is minimized if
$g(Z)$ is the mean of $X_k$ conditioned on $Z$. Thus the minimum mean square
estimate of $X_k$ is

$$\hat{X}_k = g(Z) = E(X_k \mid Z). \qquad (4)$$


Equation 4 can be rewritten, when $X_k$ and $Z$ are jointly Gaussian, in the form

$$E(X_k \mid Z) = E(X_k Z^t)\,[E(Z Z^t)]^{-1} Z. \qquad (5)$$

Equation 5 is true for any pair of jointly Gaussian vectors, and will
be used repeatedly in the development. We now define an error vector

$$\tilde{X}_k = X_k - \hat{X}_k \qquad (6)$$

and note that the minimization operation served to minimize the norm of
$\tilde{X}_k$, i.e., see Equation 1. Using Equation 5, it can be shown that:

$$E(\tilde{X}_k \hat{X}_k^t) = 0 \qquad (7)$$

$$E(\tilde{X}_k Z^t) = 0 \qquad (8)$$
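
The orthogonality stated in Equation 8 can be checked numerically. The sketch below is only an illustration under assumed values: it takes a scalar state and a two-component measurement with hypothetical second moments, forms the gain of Equation 5, and evaluates the cross-covariance between the estimation error and the measurement. All function names and numbers are assumptions.

```python
def linear_estimator_gain(c_xz, c_zz):
    """Gain G = E(X Z^t) [E(Z Z^t)]^-1 for scalar X and two-component Z."""
    (a, b), (c, d) = c_zz
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return [c_xz[0] * inv[0][j] + c_xz[1] * inv[1][j] for j in range(2)]


def error_cross_covariance(c_xz, c_zz):
    """E[(X - G Z) Z^t] = E(X Z^t) - G E(Z Z^t), expected to vanish."""
    g = linear_estimator_gain(c_xz, c_zz)
    return [c_xz[j] - (g[0] * c_zz[0][j] + g[1] * c_zz[1][j]) for j in range(2)]


# Hypothetical second moments for a zero-mean, jointly Gaussian pair.
residual = error_cross_covariance([1.0, 0.5], [[2.0, 0.5], [0.5, 1.0]])
```

Both components of `residual` come out at zero to within rounding, which is the property the development relies on when the innovations are introduced.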

The Kalman filter is now developed using the state model just described.
It is assumed that one can update after every line of data, rather than
after every data point. Each data point is related to the state by the
equation

$$Z_{ki} = H_{ki} X_k + v_{ki} \qquad (9)$$

where $H_{ki}$ is a matrix that picks the mean vector of the class, after
classification, from $X_k$. The noise vector, $v_{ki}$, is measurement noise,
with statistics

$$E(v_{ki}) = 0 \qquad (10)$$

[Equation 11, the covariance of $v_{ki}$, is not legible in the source.] (11)

One measurement vector will be used for each line, and is given by

$$Z_k = \sum_{i=1}^{N} c_{ki} Z_{ki} \qquad (12)$$

The $c_{ki}$ are weighting factors which represent the confidence that the
classification decision for each data point was correct.
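
A minimal Python sketch of forming the confidence-weighted line measurement is given below. The function name, the argument layout, and the per-class bookkeeping of summed weights are assumptions made for illustration; the report does not specify an implementation.

```python
def line_measurement(points, labels, confidences, n_classes):
    """Form the weighted sum of the data vectors for one scan line.

    points      : data vectors for the line, one list of channels per point
    labels      : class index chosen by the classifier for each point
    confidences : weighting factors, one per point
    """
    channels = len(points[0])
    z_k = [0.0] * channels
    class_weight = [0.0] * n_classes   # summed weights per recognized class
    for z, label, c in zip(points, labels, confidences):
        for ch in range(channels):
            z_k[ch] += c * z[ch]
        class_weight[label] += c
    return z_k, class_weight


z_k, class_weight = line_measurement(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [0, 1, 0], [1.0, 0.5, 0.0], 2)
```

A point given zero confidence, like the third point in the example, contributes nothing to the line measurement, so a doubtful decision cannot pull any mean vector.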

Combining Equations 9 and 12 gives

$$Z_k = H_k X_k + v_k \qquad (13)$$

where

$$H_k = \sum_{i=1}^{N} c_{ki} H_{ki} \qquad (14)$$

$$v_k = \sum_{i=1}^{N} c_{ki} v_{ki} \qquad (15)$$


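and $v_k$ is a measurement noise vector for the $k$th line. To find the
statistics, Equations 10 and 11 are used:

$$E(v_k) = 0, \qquad E(v_k v_j^t) = R_k\, \delta_{kj} \qquad (16)$$

where $R_k$ denotes the covariance of the line noise $v_k$.

We have now completed the preliminary derivation of the filter
problem. From Equation 4 it is seen that one must find

$$\hat{X}_k = E(X_k \mid Z_0, \ldots, Z_k) = E(X_k \mid \tilde{Z}_0, \ldots, \tilde{Z}_k) \qquad (17)$$

where

$$\tilde{Z}_k = Z_k - H_k \Phi \hat{X}_{k-1} = H_k (X_k - \Phi \hat{X}_{k-1}) + v_k \qquad (18)$$

and $\Phi$ denotes the state transition matrix. The second equality in
Equation 17 occurs because $\tilde{Z}_k$ is a linear combination of
$Z_0, \ldots, Z_k$. Equation 8 can be used to show that:

$$E(\tilde{Z}_j \tilde{Z}_k^t) = 0 \qquad (19)$$

for $j < k$. Using Equation 5 it is seen that Equation 17 becomes

$$E(X_k \mid \tilde{Z}_0, \ldots, \tilde{Z}_k) = E(X_k \mid \tilde{Z}_0, \ldots, \tilde{Z}_{k-1}) + E(X_k \mid \tilde{Z}_k) \qquad (20)$$

$$= \Phi \hat{X}_{k-1} + E(X_k \mid \tilde{Z}_k) \qquad (21)$$

We now use Equation 5 to find

$$E(X_k \mid \tilde{Z}_k) = E(X_k \tilde{Z}_k^t)\,[E(\tilde{Z}_k \tilde{Z}_k^t)]^{-1} \tilde{Z}_k \qquad (22)$$

Evaluating the separate parts of Equation 22,

$$E(X_k \tilde{Z}_k^t) = P_k' H_k^t \qquad (23)$$

where

$$P_k' = \Phi P_{k-1} \Phi^t + Q_{k-1} \qquad (24)$$

$$P_k = E(\tilde{X}_k \tilde{X}_k^t) \qquad (25)$$

with $Q_{k-1}$ the covariance of the state variation between lines $k-1$
and $k$, and

$$E(\tilde{Z}_k \tilde{Z}_k^t) = H_k P_k' H_k^t + R_k \qquad (26)$$

Combining Equations 20-26, we rewrite Equation 20 as

$$\hat{X}_k = \Phi \hat{X}_{k-1} + K_k \tilde{Z}_k \qquad (27)$$

where

$$K_k = P_k' H_k^t\, (H_k P_k' H_k^t + R_k)^{-1} \qquad (28)$$

$$\tilde{X}_k = (I - K_k H_k)(X_k - \Phi \hat{X}_{k-1}) - K_k v_k \qquad (29)$$

To complete the development, we evaluate Equation 25:

$$P_k = P_k' - K_k H_k P_k' \qquad (30)$$

The adaptive processing equations that could be used are Equations 18, 23,
27, 28, and 30.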
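
For orientation, the recursion can be sketched for a single scalar state. The Python code below follows the standard textbook form of the Kalman predict and update cycle; it is an assumption that this matches the report's equations term by term, and all names (`kalman_step`, `phi`, `q`, `r`) are illustrative.

```python
def kalman_step(x_prev, p_prev, z, h, r, q, phi=1.0):
    """One scalar predict/update cycle in the standard textbook form."""
    x_pred = phi * x_prev                  # predicted state
    p_pred = phi * p_prev * phi + q        # predicted error variance
    innovation = z - h * x_pred            # measurement residual
    gain = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + gain * innovation     # updated estimate
    p_new = p_pred - gain * h * p_pred     # updated error variance
    return x_new, p_new, gain


# Hypothetical values: a mean of 10 observed as 12 with unit variances.
x, p, k = kalman_step(x_prev=10.0, p_prev=1.0, z=12.0, h=1.0, r=1.0, q=1.0)
```

With these values the gain is two-thirds, so the estimate moves two-thirds of the way from the prediction toward the measurement. A larger measurement noise variance `r` would reduce the gain and slow the adaptation.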


3.2 EQUIVALENCE OF SIMPLIFIED KALMAN FILTER 

The main purpose of this section is to explain why previous adaptive 
processing developments can be put into a simplified form. An additional 
purpose is to provide logical justification for, and description of, a method 
whereby new developments can be formulated directly in the simplified form. 
The advantages of this simplified form over the normal form of adaptive
processing are reduced computation time and storage requirements.

We start by writing the equations that define the Kalman filter that
we have been using.

$$X_k = X_{k-1} + \phi_{k-1} \qquad (31)$$

$$E(\phi_k \phi_i^t) = Q_k\, \delta(k,i) \qquad (32)$$

$$Z_k = H_k X_k + n_k \qquad (33)$$

$$E(n_k n_i^t) = b_k R\, \delta(k,i) \qquad (34)$$

In these equations, as before, $X_k$ is a state vector composed of the
mean vectors of all of the classes that exist at a sampling time $t_k$. The
vector $\phi_k$ is a random vector, normally distributed, with mean zero and
covariance defined by Eq. 32. Equation 33 defines the measurement vector
$Z_k$, identified to be the MSS data vector, as a function of 1) the state
vector, 2) a pointing matrix, and 3) a random Gaussian vector, $n_k$. The
pointing matrix uses recognition results to select the components of the
state vector that form the mean vector of the class that was recognized.
The random vector, $n_k$, has zero mean and covariance defined by Eq. 34.

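Because all random processes have a normal distribution, the minimum
mean-squared estimator for $X_k$ is

$$\hat{X}_k = E(X_k \mid Z_0, \ldots, Z_j) \qquad (35)$$

The matrix $T_{k,i}$ is defined to be

$$T_{k,i} = E(X_k X_i^t) \qquad (36)$$

Note that by using Eqs. 31 and 32, one can show that

$$T_{k,k} = T_{k-1,k-1} + Q_{k-1} \qquad (37)$$

We are now ready to evaluate Eq. 35. A vector $Z$ is defined as

$$Z = \begin{bmatrix} Z_0 \\ \vdots \\ Z_j \end{bmatrix} \qquad (38)$$

and Eq. 35 written as

$$\hat{X}_k = E(X_k \mid Z) = E(X_k Z^t)\,[E(Z Z^t)]^{-1} Z \qquad (39)$$

Components of $E(X_k Z^t)$ and $E(Z Z^t)$ are

$$E(X_k Z_i^t) = E[X_k (H_i X_i + n_i)^t] = T_{k,i} H_i^t \qquad (40)$$

and

$$E(Z_i Z_\ell^t) = E[(H_i X_i + n_i)(H_\ell X_\ell + n_\ell)^t] = H_i T_{i,\ell} H_\ell^t + b_i R\, \delta(i,\ell) \qquad (41)$$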
The formulation of the Kalman filter is now complete. This formulation
is equivalent to the iterative formulation that we have been using. The
next step is to introduce a second Kalman filter which will later be
related to the simplified form.

$$\eta_k = \eta_{k-1} + \gamma_{k-1} \qquad (31a)$$

$$E(\gamma_k \gamma_i^t) = \theta_k\, \delta(k,i) \qquad (32a)$$

$$\zeta_k = M_k^t \eta_k + \nu_k \qquad (33a)$$

$$E(\nu_k \nu_i^t) = b_k\, \delta(k,i) \qquad (34a)$$

$$\hat{\eta}_k = E(\eta_k \mid \zeta_0, \ldots, \zeta_j) \qquad (35a)$$

$$S_{k,i} = E(\eta_k \eta_i^t) \qquad (36a)$$

$$S_{k,k} = S_{k-1,k-1} + \theta_{k-1} \qquad (37a)$$

$$\zeta = \begin{bmatrix} \zeta_0 \\ \vdots \\ \zeta_j \end{bmatrix} \qquad (38a)$$

$$\hat{\eta}_k = E(\eta_k \zeta^t)\,[E(\zeta \zeta^t)]^{-1} \zeta \qquad (39a)$$

$$E(\eta_k \zeta_i^t) = S_{k,i} M_i \qquad (40a)$$

$$E(\zeta_i \zeta_\ell^t) = M_i^t S_{i,\ell} M_\ell + b_i\, \delta(i,\ell) \qquad (41a)$$

To relate the two formulations we rewrite Eq. 39a as

$$\hat{\eta}_k = a_k \zeta \qquad (42)$$

where

$$a_k = E(\eta_k \zeta^t)\,[E(\zeta \zeta^t)]^{-1} \qquad (43)$$

Next, we make the assumptions

$$Q_k = \theta_k \otimes R \qquad (44)$$

$$H_k = M_k^t \otimes I \qquad (45)$$

$$T_{0,0} = S_{0,0} \otimes R \qquad (46)$$

where the symbol $\otimes$ represents the Kronecker product. Using Eqs. 46, 31,
and 32, one can show that

$$T_{i,k} = S_{i,k} \otimes R \qquad (47)$$

Next, Eq. 41 is evaluated

$$H_i T_{i,\ell} H_\ell^t + b_i R\, \delta(i,\ell) = (M_i^t \otimes I)(S_{i,\ell} \otimes R)(M_\ell \otimes I) + b_i R\, \delta(i,\ell) = [M_i^t S_{i,\ell} M_\ell + b_i\, \delta(i,\ell)] \otimes R \qquad (48)$$

and, by using Eq. 41a, we see that

$$E(Z Z^t) = E(\zeta \zeta^t) \otimes R \qquad (49)$$

In a similar manner,

$$T_{k,i} H_i^t = (S_{k,i} \otimes R)(M_i \otimes I) = S_{k,i} M_i \otimes R \qquad (50)$$

so that

$$E(X_k Z^t) = E(\eta_k \zeta^t) \otimes R \qquad (51)$$

Combining Eqs. 39, 43, 49, and 51 results in

$$\hat{X}_k = (a_k \otimes I)\, Z \qquad (52)$$

Equation 52 is an important result. It says that if one can find the
optimum filter for the simplified system, defined by Eqs. 31a to 41a, one
has also found the simplified form of the filter for the system defined
by Eqs. 31 to 41. In addition, one can also show the relationship between
the covariance matrices for the state estimation errors for the two systems,
which are

$$p_k = E[(\eta_k - \hat{\eta}_k)(\eta_k - \hat{\eta}_k)^t] \qquad (53)$$

$$P_k = E[(X_k - \hat{X}_k)(X_k - \hat{X}_k)^t] = p_k \otimes R \qquad (54)$$
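
The Kronecker-product step that makes the simplification work can be verified on a small case. The Python sketch below checks, for two classes and two channels, that a pointing vector applied through a Kronecker product with the identity extracts a scalar multiple of the common covariance. The helper functions and all numerical values are hypothetical and serve only to illustrate the algebra.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


identity = [[1.0, 0.0], [0.0, 1.0]]
s = [[2.0, 0.5], [0.5, 1.0]]   # hypothetical 2 x 2 class-level covariance
r = [[4.0, 1.0], [1.0, 3.0]]   # hypothetical common channel covariance
m = [[1.0], [0.0]]             # pointing vector selecting the first class
m_t = [[1.0, 0.0]]

# Full-size product versus the reduced form (scalar times r).
lhs = matmul(matmul(kron(m_t, identity), kron(s, r)), kron(m, identity))
rhs = kron(matmul(matmul(m_t, s), m), r)
```

Here `lhs` is computed with the full 4 x 4 state covariance, while `rhs` needs only the 2 x 2 class-level matrix and the common covariance. Their agreement is the reason the simplified filter can carry the small matrix alone.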

3.3 AUXILIARY FIELDS 

The use of auxiliary fields was introduced in [1]. Auxiliary fields 
provide a method of reducing the undesirable characteristic of signature 
capture in an updating method, including the Kalman filter. In multispectral 
scanner data processing, signature capture occurs when the mean of one 
class actually describes the data from another class. Data from one (or 
more) material is recognized incorrectly, clearly an undesirable situation. 

Of course, the same misclassification can occur without updating, and it
is possible that updating may eliminate the problem.

Auxiliary fields are ground truth fields which are located throughout
the scene. The mean values of the data from the fields are considered to
be additional measurement vectors, and are used to determine how much the
mean vectors are to be updated. Because the auxiliary fields are correctly
identified, their use should tend to overcome capture caused by incorrect
classification of other data vectors.

The formulation of the updating equations to include the auxiliary
fields has already been presented. Equation 35 is interpreted so that
the measurements $Z_j$, $j > k$, denote the measurements from the auxiliary
fields. The updating equation is (39), which is shown in simplified form
in [1].

3.4 COLORED NOISE 

One of the assumptions used for the Kalman filter is that variations
in the state vector between sampling times are uncorrelated. (This
assumption is described in Eq. 24.) We now describe a method whereby this
assumption can be removed. First, the state vector is formed in the
following manner.

$$X_k = \begin{bmatrix} Y_k \\ \omega_k \end{bmatrix} \qquad (55)$$

The $Y_k$ is the vector composed of all of the class mean vectors. The
variation between $Y_k$ and $Y_{k-1}$ is

$$Y_k = Y_{k-1} + \omega_{k-1} \qquad (56)$$

where

$$\omega_k = a\, \omega_{k-1} + n_{k-1} \qquad (57)$$

The basic difference between this development and that found in [1]
can be explained by Equation 57. In the previous developments, the state
variations for different sampling times were independent. In this
development, they are dependent. To show the dependence, we start with

$$E(n_k n_j^t) = R_n\, \delta_{kj} \qquad (58)$$

which is similar to the assumption in [1].

We find

$$E(\omega_k \omega_k^t) = a^{2i} E(\omega_{k-i} \omega_{k-i}^t) + \frac{1 - a^{2i}}{1 - a^2} R_n \qquad (59)$$

for all $i < k$. For large $k$ and $|a| < 1$,

$$E(\omega_k \omega_k^t) \approx \frac{1}{1 - a^2} R_n \qquad (60)$$

The correlation of the $\omega_k$ is

$$E(\omega_k \omega_j^t) = a^{k-j} E(\omega_j \omega_j^t) \qquad (61)$$

for $k > j$. If

$$a = e^{-1/n_0} \qquad (62)$$

then for large $k$,

$$E(\omega_k \omega_j^t) \approx e^{-|k-j|/n_0}\, \frac{1}{1 - a^2} R_n \qquad (63)$$

Thus $n_0$ can be considered to be analogous to a correlation length.
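
The role of the correlation length can be illustrated numerically. The Python sketch below assumes the first-order model in which each variation is a fraction a of the previous one plus independent noise, with a tied to the correlation length through an exponential. The function names and the example length of 20 lines are assumptions, chosen only to suggest a field-sized correlation.

```python
import math


def ar1_correlation(n0, lag):
    """Normalized correlation of the first-order process at a given lag."""
    a = math.exp(-1.0 / n0)   # coefficient tied to the correlation length n0
    return a ** abs(lag)      # equals exp(-|lag| / n0)


def stationary_variance_factor(n0):
    """Factor 1 / (1 - a^2) multiplying the driving-noise covariance."""
    a = math.exp(-1.0 / n0)
    return 1.0 / (1.0 - a * a)


# Hypothetical correlation length of 20 scan lines.
corr_at_n0 = ar1_correlation(20, 20)
factor = stationary_variance_factor(20)
```

At a separation equal to the correlation length the normalized correlation has fallen to 1/e, which is the sense in which the parameter behaves like a length.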


The simplified equations are now found by making the following
assumptions:

$ 

ii 

OM-7; 

g _H, 

© 

hn 

(64) 

Q 

1! 

(65) 

R 

n 

= 0 (x)R 

(66) 

P k 

( W\ - 

- (bid.)®? 

\ k k/ 

(67) 

"ki 

- o) ©i 

(68) 

“k 

Jx C «Ai 

(69) 


The Kronecker product, $\otimes$, is described and used in [1]. The vector
$M_{ki}$ represents the classification decision for the data point, and is
composed of ones and zeros. The simplification of the filter equations
is straightforward but tedious, so it will be omitted. The updating equations
become


Y k * \-i + vi + 

“k - “ vi + <*i2 ® Z) K 


where 


K = \- K x uK-i-K x z) vi 


(70) 

(71) 

(72) 


and ^^2 are defined by


a k M k 
& kl = “"D~ 

& k2 D 


with recursive relations: 


Vi * W b k + a k 


(yyy&yy 

D 


b k+i = “ 


b k +d k ' D 


<yy \< \ 


d Wl‘ ° 


, b X<\ 

d k 5 


+ 0 


where 


B -yA + " c L 

i=i 


(73) 

(74) 

(75) 

(76) 

(77) ’ 


(78) 


Some obvious characteristics of this technique should be noted. The
size of the state vector is increased; with m materials and n channels,
the state vector becomes 2mn rather than mn. The error matrix that must
be stored becomes 4 times as large. The increased dimensionalities may
not be too important, because the storage requirement for the filter is
not large. Also, we do not expect a large increase in processing time.
The filter without the correlated state variations increased processing
time, as compared to non-adaptive processing, by either 3% or 11%, the
smaller percentage increase occurring when the quadratic decision rule was
used. Most of the computation time was used on operations required for
each data point, e.g., forming $c_{ki} Z_{ki}$, rather than for the updating.
With the new method, only the updating equations have been changed.

4

EMPIRICAL EVALUATION OF MODIFIED KALMAN FILTER 

A test program was initiated to show the usefulness of the Kalman
filter approach to adaptive processing of LANDSAT data. Data sets
containing agricultural data were selected, since data could not
be tested for every conceivable application and under every possible
atmospheric condition. Indications of the usefulness of the technique
for other applications can be inferred from the results presented in this
section. However, additional testing should be performed so that parameters
can be chosen to ensure that maximum benefits can be realized.

4.1 EFFECT OF DECISION RULE ON CLASSIFICATION ACCURACY 

The major purpose of the adaptive processing now being developed is 
to correct the decision rule used for classification of multispectral 
scanner data. We now adapt the mean signature vectors which, with the 
covariance matrices, are used to determine the decision rule. An addi- 
tional function sometimes employed is to update preprocessing transforma- 
tions by updating an estimation of the angular variation of the data. 

Insight into the usefulness of the updating can be gained by determining
how important it is to have the correct decision rule. We have determined
the error rate for different decision rules applied to normal data from
two classes with the same covariance matrix and the same a priori
probabilities.

When the covariance matrices are equal, no generality is lost in
assuming unit distance between the means, because the covariance matrix of
either class is σ² times the identity matrix. A linear decision rule is
assumed which can be characterized by two numbers: the distance from the
origin, q, and the complement, φ, of the angle at which the decision plane
crosses the line joining the two means. See Figure 1. The optimum values
are q = 1/2 and φ = 0°.
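The two-class geometry can be turned into a short numerical sketch. The one below is an illustration under stated assumptions, not the computation used for the figures: it assumes the class means sit at 0 and 1 on a line, that the decision plane crosses that line at distance q from the first mean, and that φ is the rotation of the plane away from the optimum orientation. The names `q_func` and `average_error_rate` are hypothetical.

```python
import math


def q_func(x):
    """Upper-tail probability of a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def average_error_rate(q, phi_deg, sigma):
    """Average of the two misclassification probabilities for a linear rule.

    Assumed geometry: class means at 0 and 1 on a line, covariance
    sigma^2 * I for both classes, decision plane crossing the line at q
    and rotated phi_deg degrees away from the optimum orientation.
    """
    c = math.cos(math.radians(phi_deg))
    err_first = q_func(q * c / sigma)            # class 1 data called class 2
    err_second = q_func((1.0 - q) * c / sigma)   # class 2 data called class 1
    return 0.5 * (err_first + err_second)


if __name__ == "__main__":
    for q in (0.3, 0.5, 0.7):
        print(q, round(average_error_rate(q, 0.0, 0.25), 4))
```

Under these assumptions the error rate depends on σ and φ only through the ratio σ/cos φ, which is why that ratio is a natural plotting parameter.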
FIGURE 1. GEOMETRY OF TWO-CLASS DECISION RULE

For different values of q, φ, and σ, we have computed an average
error rate, which is the average of the two types of error (choosing the
second class given data from the first class, and vice versa). Figure 2
shows the effect of the choice of q for different values of σ/cos φ. The
value of q is most important when σ/cos φ is small and q is not near 0.5.
Otherwise, small changes in q have a very small effect on the error rate.

The same equations were plotted in Figure 3 with φ = 45°. With such a large
error in φ, it is not surprising that the choice of q is not very important
unless σ is rather small.

The effect of φ is shown in Figures 4 and 5 for two different values
of q. Once again, small changes in φ have a very small effect on the error
rate. When φ is large, there can be a large error rate.

FIGURE 3: EFFECT OF DISPLACEMENT OF THE DECISION SI
ERROR WHEN THE SURFACE IS ROTATED 45° FRI

FIGURE 6: EFFECT OF DISPLACEMENT OF THE DECISION SURFACE ON THE PROPORTION OF
DATA POINTS IDENTIFIED AS ONE OF THE CLASSES.

Recently, there has been an interest in estimating total acreage of
wheat from a data set. In our simple example, we can consider the
probability, P, of deciding the second class. This probability is shown in
Figure 6 for different values of q and σ/cos φ. If σ/cos φ is very small,
say <1/8, then there is a range of q for which the probability of choosing
the second class is approximately the correct value of 0.5. If σ/cos φ is
<1/2, then there is an approximately linear relationship between P and q:

P ≈ 1/2 + (1/2 − q) [1/2 − Φ(σ/cos φ)]    (79)


where 


<Kx) = 




dz 


(80) 
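The probability P of deciding the second class can also be evaluated directly from the normal distribution, without the linear approximation. The sketch below is an illustration only and uses an assumed geometry (means at 0 and 1, equal a priori probabilities, plane crossing the line joining the means at q); `proportion_second_class` is a hypothetical name, and the argument `s` stands for σ/cos φ.

```python
import math


def q_func(x):
    """Upper-tail probability of a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def proportion_second_class(q, s):
    """Probability of deciding the second class with equal priors.

    s stands for sigma / cos(phi). Assumed geometry: means at 0 and 1,
    decision plane crossing the line joining them at q.
    """
    from_first = q_func(q / s)                  # class 1 points past the plane
    from_second = 1.0 - q_func((1.0 - q) / s)   # class 2 points past the plane
    return 0.5 * (from_first + from_second)


if __name__ == "__main__":
    for s in (1.0 / 16.0, 0.5):
        print(s, [round(proportion_second_class(q, s), 3) for q in (0.3, 0.4, 0.5, 0.6, 0.7)])
```

With a small value of `s` the printed proportions stay near 0.5 over a range of q, while with a larger `s` they change steadily with q, which is the qualitative behavior described in the text.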


Let us now consider the usefulness of adaptive processing. When the
criterion is minimum error rate, it appears that adaptive processing will
be most useful when there are slowly varying changes in the means and the
covariance matrices are not large compared to the separation of the means.
For small deviations in estimating the means the error rate will be small,
so the adapting can correct the deviations. Thus if the means change
slowly, the adaptive processing can track the means. If the means are not
changing, the adaptive processing will find the correct means, but there
will be little effect on the error rate. For large deviations, the error
rate will be greater, and either the adaptive processing will respond
slowly, but correctly, or capture will occur.

When the criterion of excellence is total acreage, a different picture
emerges, at least for LANDSAT data. The linear relationship will hold for q,
so that any percentage of recognitions can be obtained. If q is incorrect,
then the farther φ is from the correct value the better the acreage estimation.
It appears that average error rate is a better criterion for evaluating the
usefulness of adaptive processing, even though small changes in the decision
rule parameters may not be noticeable.
4.2 PROCESSING WITH INCORRECT MEANS

One of the advantages of adaptive processing is that it may enable
the use of signatures that are somewhat in error. To test this concept,
we tried different signatures, with and without adapting, on LANDSAT data.
The fields were divided so that one half of the fields were test fields
and the remaining fields were training fields. Both sets of fields were
distributed through the scanned area. For each test result we increased
or decreased all signature mean vectors by the same percentage and computed
the percent correct recognition. The results are shown in Figure 7, which
shows the percent correct recognition as a function of percent change in
the means for both adaptive and non-adaptive processing. The means can
vary approximately 20% with adaptive processing without degrading
classification accuracy below that obtained with the measured means and
non-adaptive processing. For 90% processing accuracy, the means can vary
either 9% or 23%, depending upon whether non-adaptive or adaptive
processing is used. For 85% accuracy, these numbers change to 19% and 29%. When
we used very large percentage changes, neither method produced acceptable
accuracy, although the non-adaptive processing would be preferred because
capture cannot occur.

A visualization of the effects of adaptive processing can be seen in
Figure 8, obtained from the same data and signatures as were used for
Figure 7. The means of the ground-truthed fields were computed and compared,
for each field, with the signature mean. Of course, without adapting, the
signature mean is fixed, while with adapting the mean is modified as the data
change. These differences are shown as a function of the beginning line
number (indicative of the order of processing) for both non-adaptive and
adaptive processing. For adaptive processing, the field means are centered
around the true means, although there is a significant deviation which is
caused by inter-field variations within each class. For non-adaptive
processing, the field means are not centered around the signature means.
FIGURE 7: CLASSIFICATION ACCURACY WITH DECISION RULES DERIVED FROM
INCORRECT MEANS (TEST FIELDS FROM N. DAKOTA ERTS DATA)

FIGURE 8: DIFFERENCES BETWEEN FIELD-CENTER MEANS AND DECISION-RULE MEANS
WITH AND WITHOUT ADAPTING (N. DAKOTA ERTS DATA, +15% ERROR IN ALL MEANS;
HORIZONTAL AXIS: FIELD BEGINNING LINE NUMBER)
One can see the manner in which the signature means are updated in
Figure 9, which shows the mean values in channel 2 (LANDSAT channel 5)
for four classes. The values shown for line 1350 are the values obtained
from the signatures. Note the gradual change in values as the line number
increases. A slightly different set of curves is shown in Figure 10, which
was obtained by changing the Kalman filter program so that three auxiliary
fields would be used. The stars indicate the auxiliary fields, which are
on the mean curves because the program is written so that the means coincide
with the means of the auxiliary fields at the time the fields are being
processed.

4.3 PROCESSING TIME REQUIRED FOR ADAPTIVE PROCESSING 

An important consideration in adaptive processing is the additional
processing time that the adapting requires. In Table 1, we compare
processing times and accuracies for various operating modes. The preprocessing
that was used was our standard scan angle correction program named ACORN.
There is an approximately 11% processing time increase when adapting is used,
which would reduce to approximately 3% had a quadratic decision rule been
used rather than a linear decision rule. There is also an increase in
processing time when we adapt the angle correction. For this data set,
it appears that we gain by adapting the means, but adapting the angle
correction is not as good as using ACORN.

4.4 EFFECTIVENESS OF MODIFICATION FOR COLORED NOISE

The algorithm considered here was described in detail in Section 3.4.
It differs from the original adaptive processing algorithm principally in
that it assumes that variations of the class mean vectors at different
sampling times are correlated. A correlation length associated with this
correlation is an input parameter to the program. Thus it may be set to
approximate the average field length for the data being processed.

Programming of this algorithm was completed and tests were made to
confirm the correctness of the program and the characteristics of the
algorithm.
FIGURE 9: VARIATIONS OF UPDATED MEANS WITHOUT USE OF AUXILIARY FIELD
(NORTH DAKOTA ERTS DATA, CHANNEL 2 MEANS)

FIGURE 10: VARIATIONS OF UPDATED MEANS WITH USE OF AUXILIARY FIELDS



TABLE 1. ALTERNATIVE CLASSIFICATION METHODS

                                              % CORRECT      EXECUTION
ADAPTING              PREPROCESSING           RECOGNITION    TIME (min.)
No Adapting           No Preprocessing        83.4            7
No Adapting           Preprocessing           86.9           11
Adapting              No Preprocessing        85.7            8
Adapting              Preprocessing           90.6           12
Adapting with Scan    No Preprocessing        86.1           13
  Angle Correction

Figures based on classification into 5 classes using 5 channel aircraft data
For these tests ERTS data from N. Dakota were used. This data set
has been used extensively in tests of the original Kalman filter program.
Fields in this data set are generally larger than those in the CITARS data
sets, although probably not as large as could be found in major
wheat-growing data sets.

Storage requirements for the correlated state variation program are
greater than for the original program. The state vector is twice as long
(40 elements) and the error matrix 4 times larger (5x5 elements). This
has not caused any problems. Storage requirements of the new program are
well within the limitations of the 7094 computer (32 K words). Total
processing time for the test data set was virtually the same as with the
original adaptive program.

Figure 11 shows classification accuracy vs. θ₁ for both the original
algorithm and the correlated state variation version. These figures are
based on classification of 35 fields into 5 classes. Also shown for
comparison is the accuracy obtained with conventional non-adaptive linear rule
classification.

It can be seen that the new algorithm actually attains a slightly
higher accuracy than the original, but at a much smaller value of θ₁.
This difference is in agreement with the theory presented in Section 3.4, which
predicts that θ₁ for the new algorithm should be smaller by a factor of
1 − α² to obtain the same updating rate. For this case 1 − α² ≈ 0.02.

Figures 12 and 13 show the effect of changing the correlation length
parameter. In both figures the lower plot shows the change in the mean
vector for a particular class and channel as a function of line number.
The upper plot shows the change in the corresponding element of w in
equation 55.

In Figure 12 the value of n₀ was 30. In Figure 13 n₀ is increased to
90. As expected, the mean changes more rapidly in Figure 13 due to the
higher effective updating rate caused by the higher value of n₀.
o 
4.5 ADAPTIVE PROCESSING LIMITATIONS

Further tests have been conducted to evaluate the performance of
adaptive processing. For these tests both conventional linear rule
classification and adaptive classification were performed on the same
data set using signatures extracted from the data set being processed.
Adaptive processing periodically updates the signature means on the basis
of previous classification decisions to account for inter- and intra-field
variations. Conventional classification uses the same means throughout the
data set. To compare the recognition accuracy of the two methods, average
percent correct recognition for a number of fields of known crop type was
computed.
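The decision-directed idea, in which each classification decision selects the mean that is then updated, can be illustrated with a deliberately simplified sketch. This is not the Kalman filter of this report: it substitutes a nearest-mean rule and a fixed-gain update for the filter equations, and the names `classify`, `adapt`, and `gain` are assumptions made for the illustration.

```python
def classify(x, means):
    """Return the index of the nearest class mean (squared Euclidean distance)."""
    dists = [sum((xi - mi) ** 2 for xi, mi in zip(x, m)) for m in means]
    return dists.index(min(dists))


def adapt(points, means, gain):
    """Classify each point, then move the chosen class mean toward that point.

    gain = 0 reproduces conventional (non-adaptive) classification;
    larger gains correspond to faster updating.
    """
    means = [list(m) for m in means]   # work on a copy of the signatures
    decisions = []
    for x in points:
        k = classify(x, means)
        decisions.append(k)
        means[k] = [m + gain * (xi - m) for xi, m in zip(x, means[k])]
    return decisions, means


if __name__ == "__main__":
    # Two one-channel classes whose values drift upward along the scan.
    data = [[0.0], [10.0], [1.0], [11.0], [2.0], [12.0]]
    print(adapt(data, [[0.0], [10.0]], gain=0.5))
```

Because only the mean of the chosen class is updated, a wrong decision moves the wrong mean. With similar signatures and a large gain this is the mechanism by which one signature can capture another class.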

Previous comparisons of adaptive and conventional processing using
LANDSAT data have shown that adaptive processing is capable of reducing
classification errors by as much as one-third. The tests described below
offer further confirmation of another previously reported result: if two
or more signatures are similar enough that confusion exists
between them with conventional classification, adaptive classification will
give even poorer results. However, if the signatures are reasonably
distinct, adapting can generally be expected to improve classification accuracy.

Two LANDSAT data sets from the CITARS project were used for this testing.
Fayette Co., Illinois data from 21 August and White Co., Indiana data from
the same date were processed.

Table 2 shows the results of processing these data sets using both
conventional and adaptive classification. The percentages shown are based
on 125 fields from Fayette Co. and 166 fields from White Co. Signatures for
six materials were obtained from the CITARS project.

TABLE 2. % CORRECT RECOGNITION WITH AND WITHOUT ADAPTING FOR
FAYETTE AND WHITE CO. CITARS DATA

          FAYETTE CO.    FAYETTE CO.    WHITE CO.      WHITE CO.
          NO ADAPTING    ADAPTING       NO ADAPTING    ADAPTING
CORN      87.4           89.5           72.0           71.2
SOY       85.5           85.5           77.7           77.7
OTHER*    68.7           66.8           90.4           90.4

* Other includes trees, bare soil, clover, and weeds.
The White Co. results are essentially unchanged by adapting. There is a
slight (<1%) reduction in classification accuracy with adapting.

The Fayette Co. data show an approximately 2% increase in accuracy of corn
recognition with adapting and a corresponding 2% decrease in accuracy of
the "other" class (trees, bare soil, clover and weeds).

An increase in recognition of one class and a corresponding decrease
in another is characteristic of signature capture during adaptive processing.
For these two data sets there is sufficient similarity in the signatures to
cause capture. As has been noted previously, this precludes the successful
use of the present adapting algorithm.

Capture can also occur with reasonably separated signatures if too
rapid an updating rate is used. Figure 14 illustrates this. The solid line
shows total recognition accuracy vs. θ₁ for Michigan LANDSAT data classified
into 5 classes and representing 107 fields from Ionia and Clinton Co. The
dashed line shows the percentage of corn correctly classified. Maximum 

_Q n 

accuracy is obtained with 0 in the range of 10 to 10 . Classification 

accuracy decreases for larger values of θ₁ (faster updating).

The dashed line shows a major cause of this decrease in accuracy.
Virtually all of the corn pixels which were not classified as corn were
classified as senescent vegetation, a class which included grass, field beans
and alfalfa. The senescent vegetation signatures are close enough to the
corn signature to cause just over 35% of the total corn pixels to be
incorrectly classified as senescent vegetation even with small θ₁. This
causes the senescent vegetation means to be updated such that these signatures
capture even more corn pixels. The higher the updating rate, the more
complete this capture process becomes. For θ₁ = 10⁻⁷ the senescent vegetation
signatures have captured over 90% of the corn pixels.

Tables 3 and 4 show the classification results in greater detail for
the two extremes of θ₁. The left-hand column gives the class name. This
is followed by the number of fields and number of pixels actually belonging
TABLE 3. CLASSIFICATION RESULTS WITH θ₁ = 10⁻⁷

IONIA AND CLINTON CO. LANDSAT DATA


CLASS 

■NR. 

PLOTS 

NR. 

POINT 

SIGNATURES. . 

SOY BARE 

CORN BEANS TREES SOIL 

SENESC 

VEG. 

CORN 

37 

297 

7.1 

,3 

92.6a 

SOYBEANS 

6 

27 

3.7 

66.7 

29.6 

TREES 

6 

47 

38.3 

19.1 

42.6 

BARE SOIL 

10 

53 


100.0 


SENESC. VEG. 

48 

260 


2 *7 . 18.5 

78.8 


107 

684 





TABLE 4. CLASSIFICATION RESULTS WITH θ₁ = 10⁻¹⁰

IONIA AND CLINTON CO. LANDSAT DATA

                                         SIGNATURES
                NR.      NR.              SOY              BARE     SENESC.
CLASS           PLOTS    POINTS   CORN    BEANS   TREES    SOIL     VEG.
CORN             37       297     64.0             0.3              35.7
SOYBEANS          6        27      3.7    85.2                      11.1
TREES             6        47     40.4     2.1    51.1               6.4
BARE SOIL        10        53                              100.0
SENESC. VEG.     48       260      0.8     4.6              12.3    82.3

                107       684
to that class. The remaining columns form a matrix showing what percentages
of this total were assigned to the various classes.
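The layout of Tables 3 and 4 amounts to a confusion matrix normalized by row. A minimal sketch of that normalization follows; `row_percentages` is a hypothetical helper, and the pixel counts in the example are illustrative values chosen only so that a 297-pixel class yields percentages of 64.0, 0.3, and 35.7. They are not counts taken from the report.

```python
def row_percentages(counts):
    """Convert per-class pixel counts into row percentages.

    counts[i][j] is the number of pixels of true class i that were
    assigned to signature j. Each row is divided by its own total.
    """
    table = []
    for row in counts:
        total = sum(row)
        table.append([round(100.0 * c / total, 1) if total else 0.0 for c in row])
    return table


if __name__ == "__main__":
    # Illustrative counts for one 297-pixel class (hypothetical values).
    print(row_percentages([[190, 0, 1, 0, 106]]))
```

Each row sums to roughly 100% regardless of class size, so small classes carry the same visual weight as large ones; the pixel totals in the NR. POINTS column are needed to judge overall accuracy.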

Recall that the updating rate, as developed in [1], is defined as
follows for updating after every point. Assume that a sufficient number
of zero measurement vectors have been sampled so that the estimate of the
first mean is the zero vector. Now, if the measurements become vectors
composed entirely of the number 1, the updating rate is the number of
updates required for the mean estimate to become 1 − e⁻¹. An approximate
expression for this number of updates has been derived. A more useful form is
to express this in terms of lines rather than updates. The present program
updates after every 1/3 scan line, each of which contains N points. If we
let the variance of each state variable be Nθ₁, because the state variation
is assumed to occur after each line rather than after each update, then the
updating rate becomes 1/(3√(MNθ₁)) lines, for M materials. Updating rates on
the order of tens to hundreds of lines proved to be suitable for LANDSAT
agricultural data.

Table 5 shows updating rate information for data sets from 4 different
LANDSAT frames representing different geographical locations. In each case
the value of θ₁ shown is the value which resulted in the maximum percent
correct recognition for all known fields in the processed area. Parameters
not shown were held constant while processing all 4 data sets.

For the North Dakota and Kansas data the updating rate is of the same
order as the size of typical fields found in the scene. The Michigan and
Illinois data required slower updating to avoid capture since the materials
being classified were more easily confused than for the other data sets.
TABLE 5. OPTIMUM UPDATING RATES FOR 4 DATA SETS

                 NUMBER OF    NUMBER OF                    UPDATING
LOCATION         MATERIALS    POINTS PER LINE    θ₁        RATE (LINES)
North Dakota     5            650                5×10⁻⁸    26
Kansas           3            101                10⁻⁵      6
Michigan         7            310                10⁻⁸      72
Illinois         6            126                10⁻⁹      383
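The tabulated updating rates can be checked against a candidate closed form. The sketch below is hedged on two assumptions: the expression 1/(3√(M N θ₁)) is an inference from the tabulated values rather than a formula quoted from the text, and the North Dakota rate and Michigan θ₁ entries are read as 26 lines and 10⁻⁸. The function name `updating_rate_lines` is hypothetical.

```python
import math


def updating_rate_lines(n_materials, points_per_line, theta1):
    """Candidate updating rate in lines: 1 / (3 * sqrt(M * N * theta1)).

    This expression is an assumption inferred from the tabulated values,
    not a formula quoted from the report.
    """
    return 1.0 / (3.0 * math.sqrt(n_materials * points_per_line * theta1))


# (location, materials M, points per line N, theta1, tabulated rate in lines)
ROWS = [
    ("North Dakota", 5, 650, 5e-8, 26),
    ("Kansas", 3, 101, 1e-5, 6),
    ("Michigan", 7, 310, 1e-8, 72),   # theta1 read as 1e-8 (assumption)
    ("Illinois", 6, 126, 1e-9, 383),
]

if __name__ == "__main__":
    for name, m, n, theta1, tabulated in ROWS:
        print(name, round(updating_rate_lines(m, n, theta1), 1), tabulated)
```

A smaller θ₁ gives a larger number of lines per update time constant, that is, slower updating, which matches the slower rates needed for the more easily confused Michigan and Illinois materials.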


There is another parameter, φ₀, that can have an effect on the
classification accuracy. φ₀ is the initial value of the state error matrix
in simplified form. It represents an estimate of the starting error in the
material means. In normal use, the starting means are obtained from fields
near the beginning of the area to be processed. In this case φ₀ may be
assumed to be zero. However, if it is known that the starting means are
not representative of the area being classified, then a non-zero φ₀ is
appropriate. This situation might arise when means from one area are used
to classify another geographically distant area. Thus φ₀ can be used to
introduce a transient updating rate that is different from the steady-state
rate.

Table 6 illustrates the use of non-zero values of φ₀ when the means
being used are not representative. Four classification runs were made on
a data set from the North Dakota LANDSAT frame. For the first two runs,
signatures were used which were obtained from the data set being classified.
The runs differed only in having different values of φ₀. For the second
two runs the signature means were increased by 5% to make them intentionally
non-representative. Again two different values of φ₀ were used.

Table 6 shows percent correct recognition for the four runs. It can
be seen that a zero value of φ₀ is preferable when the correct means are
used, but a non-zero φ₀ is preferable when the means are not correct.
TABLE 6. EFFECT OF TRANSIENT UPDATING RATE
(% CORRECT RECOGNITION)

MEAN CHANGE (%)    φ₀ = 0    φ₀ = 10⁻²
 0                 92.7      90.4
+5                 88.1      92.9

Another test of adaptive vs. conventional processing with CITARS data
involved the use of signatures for Fayette Co. data of 11 June to classify
Fayette Co. data of 10 June. The 11 June signatures were adjusted, using
a MASC (Multiplicative and Additive Signature Correction) transformation,
before being used to classify the 10 June data. The MASC transformation
technique is explained in detail in [2]. The details of the process are
not important to this test, however, since we are using the same
signatures with both classifying techniques. What is significant is the
choice of materials in this particular signature set. Signatures for
wheat, water, trees, bare soil, and weeds were included. These signatures
are not as easily confused as the ones used in the test described above,
so better results would be expected using adapting.

Table 7 shows the percentage of pixels correctly classified by
conventional processing and adaptive processing for four classes. Three
sets of figures are given for adaptive processing, corresponding to different
updating rates. Adaptive processing with θ₁ = 10⁻⁸ or 10⁻⁹ gives results
somewhat better than conventional processing.
TABLE 7. % CORRECT RECOGNITION WITH AND WITHOUT
ADAPTING FOR FAYETTE CO. CITARS DATA

                                      ADAPTING
           NO ADAPTING    θ₁ = 10⁻⁷    θ₁ = 10⁻⁸    θ₁ = 10⁻⁹
WHEAT      93             86           92           93
WATER      95             97           97           97
TREES      71             66           69           72
OTHER      88             91           90           88
AVERAGE    86.8           85.0         87.0         87.5


In another test, we tried adaptive processing on a data set for which
the classification accuracy had been poor. The main cause of the poor accuracy
was the classification of 38% of the corn data points as another class,
namely trees. We hoped that we could increase the classification accuracy
by inserting into the filter a large correlation coefficient between
the corn and trees means.

The data set was collected in the same general area as that used for
the surface water test. The classification accuracy with normal processing
was 62%. With accuracy this poor, we would not expect that adaptive
processing would be useful. Indeed, the accuracy was reduced to
51%, with 70% of the corn data misclassified as trees. Figure 7 provides
further confirmation of the result that when normal processing has poor
accuracy, adaptive processing has poorer accuracy. It is also worth
repeating the converse, that when normal processing has high accuracy,
adaptive processing improves the accuracy.
5

CONCLUSIONS AND RECOMMENDATIONS 


The Kalman filter approach to adaptive processing appears to be useful
for classification of LANDSAT data. The approach fails when the data are
difficult to classify, because the successful operation of the filter
depends upon the correctness of the decisions. However, when the decisions
tend to be correct, each data point is used to update the proper mean
vector, which improves the decision rule, thereby improving classification
accuracy.

The use of auxiliary fields appears to be useful when processing entire
LANDSAT frames. When small portions of frames are to be processed, e.g.,
LACIE, then there is a choice between using ground-truthed fields as
auxiliary fields or for signature extension. Tests should be conducted to
determine which use of the fields is preferable.

The increases in computer processing time and memory requirements that
adaptive processing causes are probably not significant. There are also
negligible penalties in processing time and memory when using the colored
noise modification.

For the limited amount of testing that has been performed, all of the
Kalman filter modifications appear to perform the function for which they
were designed. Exactly which modification should be used for a specific
task should be determined for that task by additional testing. The testing
that would be most appropriate at this time would be for the LACIE project.
Let us now consider how such a test program might be conducted.

LACIE data would be used to perform experiments designed to test the
following hypothesis: given a data set on which conventional linear rule
recognition processing gives reasonably good classification accuracy, a
decision-directed Kalman filter adaptive classifying algorithm will provide
more accurate classification. More specifically, we would compare the
performance of conventional and adaptive processing in three different
applications.
First, local recognition: for this case one or more intensive study
sites would be classified using signatures extracted from the same site.
Second, non-local recognition: in this case, signatures from one site
would be used to classify a second site. Third, signature extension:
here, a large area containing two intensive study sites would be processed.
This area would be selected so that one study site is near the beginning
of the data to be processed and the other is near the end. Signatures
would be obtained from the first site and classification accuracy would
be determined for the second site.

For each of these tests we must consider the method of evaluation to
be used and the selection of training and test areas. The evaluation
technique would be the same for all three tests. For both conventional
and adaptive processing the total acreage of the materials classified, as
well as percent correct recognition for all individually identified
fields, would be found.

For the local recognition tests, site selection is not critical.
Any of the intensive study sites could be used. Training fields should
be selected from the beginning of the site since the adapting algorithm
assumes that the initial error in the means is zero. The study site could
be divided into training and test sections at some arbitrary point,
leaving, for example, the first third of the site for training and the
rest for test. Within these areas all fields large enough to contain
at least as many field center (pure) pixels as data channels would be
identified.
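The field selection just described can be sketched in a few lines. This is only an illustration of the proposed procedure: the dictionary keys `name`, `start_line`, and `pure_pixels`, the function name `split_fields`, and the default one-third boundary are assumptions made for the example.

```python
def split_fields(fields, site_first_line, site_last_line, n_channels,
                 train_fraction=1.0 / 3.0):
    """Divide usable fields into training and test sets by starting line.

    fields: list of dicts with hypothetical keys 'name', 'start_line'
    and 'pure_pixels'. A field is usable if it has at least as many
    field-center (pure) pixels as there are data channels.
    """
    boundary = site_first_line + train_fraction * (site_last_line - site_first_line)
    usable = [f for f in fields if f["pure_pixels"] >= n_channels]
    training = [f["name"] for f in usable if f["start_line"] < boundary]
    test = [f["name"] for f in usable if f["start_line"] >= boundary]
    return training, test


if __name__ == "__main__":
    example = [
        {"name": "A", "start_line": 10, "pure_pixels": 12},
        {"name": "B", "start_line": 50, "pure_pixels": 3},
        {"name": "C", "start_line": 200, "pure_pixels": 4},
        {"name": "D", "start_line": 250, "pure_pixels": 20},
    ]
    print(split_fields(example, 0, 300, n_channels=4))
```

Taking the training fields from the start of the site keeps the example consistent with the assumption that the initial error in the means is zero.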

Non-local recognition tests would be restricted to the use of two or
more intensive study sites found on the same LANDSAT frame. Four such
instances are listed below:

FRAME         DATE            INTENSIVE STUDY SITES
1457-16551    23 October      Finney & Morton
1635-16395    19 April        Saline & Ellis
1689-16382    12 June 1974    Saline & Ellis
1725-16371    18 July         Saline & Ellis
The most stringent requirements are presented by the signature
extension test. Here we need two study sites on the same quarter frame
(tape) so that processing can be carried on uninterrupted from one site
to the next. No quarter frame exists which contains any two of the
Kansas sites. The situation for the Texas sites is not known to us since
we do not yet have the tapes.

If no suitable tape contains two intensive study sites it may be 
necessary to use one study site and one SRS area for this test. SRS 
areas are smaller and less accurately examined than the intensive 
study sites but there are more of them. 

We should add one final thought. The adaptive processing techniques 
discussed in this report have been tested using a multi-class decision 
rule. The techniques should apply directly to the LACIE decision rule, 
which is one that uses multiple signatures to form a two-class ratio test. 
When the ratios are formed, all of the quadratic functions needed for a 
multi-class decision rule are computed. The only computer functions that 
would be required to make a multi-class decision, which would be required 
when adaptive processing is to be used, are additions and amplitude 
comparisons. Consequently, the advantages of adaptive processing could be 
realized with only a small increase in processing time, probably less than 
a five percent increase. 